CLAIMS 

Having thus described our invention, what we claim as new, and desire to secure by 
Letters Patent is: 

1. A multistage microprocessor pipeline structure for executing
processing instructions comprising:

an instruction cache, an instruction buffer, a decoder, a register file, an
arithmetic logic unit, and a cache memory, wherein to start execution of an instruction,
the instruction is fetched from the instruction cache and loaded into the instruction
buffer, and the instruction is then decoded for the operation by the decoder, and
corresponding register values for execution of the instruction are read from the register
file and are input to the arithmetic logic unit which executes the instruction;

width determination logic receives outputs from the decoder and
determines a minimum effective processing operation width for executing each
processing instruction and propagates width control data along with data for execution
of the instruction through the pipeline structure for executing the processing
instruction;

the microprocessor pipeline structure comprises a plurality of reduced
bit width slices for execution of the instruction, wherein each slice comprises a
reduced bit width portion of the register file, a reduced bit width portion of the
arithmetic logic unit, and a reduced bit width portion of the cache memory, with a data
carry operation proceeding from a lesser significant slice to a more significant slice,
and the slices all operate in parallel when a full bit width processing operation is
executed, or only a minimum required number of slices is enabled if the width of the
processing operation is determined to be narrower than a full bit width processing
operation, and different slices are enabled and process data on a cycle-by-cycle basis.

2. The multistage microprocessor pipeline structure of claim 1,
wherein the width determination logic uses data about the length of operands stored in 
a register file tags module that stores value bit information about each operand in the 
register file, including the sign and width of each operand, and one or more leading 
bits of one or more bytes of each operand, which are examined for data overflow from 
a lesser significant slice to a more significant slice. 

3. The multistage microprocessor pipeline structure of claim 2, 
wherein the value bit information includes a sign of an operand in one bit, a register 
data width in bytes of the operand value in two bits, and one or more leading bits of 
one or more of the most significant bytes of the operand. 
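
By way of illustration only, and forming no part of the claims, the value bit information of claim 3 might be packed into a small tag as sketched below. The field order, the choice of two leading bits, the encoding of the width as bytes minus one (so that widths of 1 to 4 bytes fit in two bits), and the names `pack_tag` and `unpack_tag` are all assumptions made for this example.

```python
# Illustrative sketch only. The layout [sign | width-1 | leading bits]
# and every name here are assumptions, not taken from the claims.
WIDTH_BITS = 2   # register data width in bytes, stored as (bytes - 1)
LEAD_BITS = 2    # assumed number of leading bits kept per operand


def pack_tag(sign, width_bytes, leading):
    """Pack sign (1 bit), width in bytes (2 bits) and leading bits into one tag."""
    if sign not in (0, 1):
        raise ValueError("sign must be 0 or 1")
    if not 1 <= width_bytes <= (1 << WIDTH_BITS):
        raise ValueError("width_bytes out of range")
    if not 0 <= leading < (1 << LEAD_BITS):
        raise ValueError("leading bits out of range")
    return (sign << (WIDTH_BITS + LEAD_BITS)) | ((width_bytes - 1) << LEAD_BITS) | leading


def unpack_tag(tag):
    """Recover (sign, width_bytes, leading) from a packed tag."""
    leading = tag & ((1 << LEAD_BITS) - 1)
    width_bytes = ((tag >> LEAD_BITS) & ((1 << WIDTH_BITS) - 1)) + 1
    sign = (tag >> (WIDTH_BITS + LEAD_BITS)) & 1
    return sign, width_bytes, leading
```

Under these assumptions a tag occupies five bits per register, which is small relative to a full width operand.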

4. The multistage microprocessor pipeline structure of claim 2, 
wherein the output of the decoder indicates two source registers and one destination 
register, and an instruction operation code, and the register file tags module and the 
instruction operation code are used to determine the number of slices required for 
executing the corresponding processing instruction, and that number of slices is
enabled in subsequent cycles in the pipeline structure. 

5. The multistage microprocessor pipeline structure of claim 1, 
wherein a cache tag file stores addresses of the cache memory to write to and read 
from, and a width tag file stores the width of data stored in each memory address. 

6. The multistage microprocessor pipeline structure of claim 1, 
wherein the width determination logic outputs the width control bits to enable data 
flow and computation in the slices. 

7. The multistage microprocessor pipeline structure of claim 1,
wherein enabling and disabling of each slice is accomplished by clock gating, where
during enabling, clock signals allow data to proceed into and through a slice, and
during disabling, clock signals block the flow of data into and through a slice.

8. The multistage microprocessor pipeline structure of claim 2,
wherein the width determination logic determines the likelihood of a data overflow
being generated from a narrow slice operation by examining one or more leading bits
of the operands which are stored in the register file tags module and generates one of
three determinations:

no data overflow is guaranteed, and the effective operation width is
determined by the width of the narrow operands;

data overflow is guaranteed, and the effective operation width must be
one byte larger than the width of the narrow operands;

data overflow is possible but not certain, wherein a carry into the bits
examined is propagated as a carry out.
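
By way of illustration only, and forming no part of the claims, the three determinations of claim 8 can be sketched for one assumed case: the addition of two unsigned operands of equal width, examining the top `k` bits of each. The restriction to unsigned addition and the names `Overflow` and `predict_overflow` are assumptions for this example; the claims do not fix the operation or the number of bits examined.

```python
from enum import Enum


class Overflow(Enum):
    NONE = "no data overflow"
    CERTAIN = "data overflow guaranteed"
    POSSIBLE = "data overflow possible"


def predict_overflow(lead_a, lead_b, k):
    """Classify carry-out for an unsigned add from the top k bits of each operand.

    lead_a and lead_b are the k leading bits of the most significant
    occupied byte of each operand (an assumed tag content).
    """
    total = lead_a + lead_b
    limit = 1 << k
    if total >= limit:
        # The leading bits alone already overflow the k-bit field.
        return Overflow.CERTAIN
    if total == limit - 1:
        # All ones: a carry into the examined bits propagates out as a carry.
        return Overflow.POSSIBLE
    # Even with a carry in, the sum of the leading bits stays below the limit.
    return Overflow.NONE
```

In the `POSSIBLE` case the outcome depends only on whether the lower, unexamined bits generate a carry into the examined bits, which matches the third determination recited above.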

9. The multistage microprocessor pipeline structure of claim 1,
wherein following execution and completion of a processing operation by the
arithmetic logic unit, the width of the value of the processing operation result is
determined, after which the width determination logic determines value bit
information for the processing operation result by combining its sign bit, value width
and one or more leading bits for its one or more leading bytes, which is then written to
a destination register in the register file.
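
By way of illustration only, and forming no part of the claims, the derivation of value bit information for a result, as recited in claim 9, might look like the sketch below. It assumes a two's-complement signed result, a four-byte full width word, and two leading bits taken from the most significant occupied byte; the names `result_width_bytes` and `value_bit_info` are hypothetical.

```python
def result_width_bytes(value, word_bytes=4):
    """Smallest byte count that holds `value` as a two's-complement integer."""
    for n in range(1, word_bytes + 1):
        lo = -(1 << (8 * n - 1))
        hi = (1 << (8 * n - 1)) - 1
        if lo <= value <= hi:
            return n
    raise ValueError("value does not fit in the full width word")


def value_bit_info(value, lead_bits=2, word_bytes=4):
    """Return (sign, width_bytes, leading_bits) for a result value."""
    sign = 1 if value < 0 else 0
    width = result_width_bytes(value, word_bytes)
    # Most significant occupied byte; Python's >> is arithmetic for negatives.
    top_byte = (value >> (8 * (width - 1))) & 0xFF
    leading = top_byte >> (8 - lead_bits)
    return sign, width, leading
```

For example, under these assumptions a result of 300 needs two bytes, and its most significant occupied byte is 0x01, so both leading bits are zero.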

10. A method for reducing logic activity in the execution of an
operation in a processor comprising the steps of:

selecting at least one operand associated with said operation,

looking up a width and a value of selected bits of said at least one
operand,

determining a prediction of arithmetic overflow, based upon the width 
and the value of said selected bits of said at least one operand, 

determining an effective width of said operation based upon the width 
of said at least one operand, a function specified by said operation, and said prediction 
of arithmetic overflow, 

enabling the width of the resources in said processor corresponding to 
said effective width of said operation for executing said operation, 

executing said operation, 

determining the width of the result of said operation based upon the 
step of executing. 
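
By way of illustration only, and forming no part of the claims, the steps of claim 10 can be traced in software for one assumed case: unsigned addition on a processor with four 8-bit slices. Here the width and leading bits are computed from the operands rather than looked up in a tag store, and a possible overflow is conservatively treated like a certain one; both simplifications, and all function names, are assumptions for this example.

```python
def byte_width(x):
    """Width in bytes of an unsigned value (at least one byte)."""
    return max(1, (x.bit_length() + 7) // 8)


def effective_slices(a, b, lead_bits=2, total=4):
    """Number of 8-bit slices to enable for a + b."""
    w = max(byte_width(a), byte_width(b))
    shift = 8 * w - lead_bits
    top = (a >> shift) + (b >> shift)
    # Overflow certain or possible: enable one more slice (conservative).
    extra = 1 if top >= (1 << lead_bits) - 1 else 0
    return min(w + extra, total)


def sliced_add(a, b, n_slices):
    """Add using only n_slices slices, carrying from lesser to more significant."""
    carry = 0
    result = 0
    for i in range(n_slices):
        s = ((a >> (8 * i)) & 0xFF) + ((b >> (8 * i)) & 0xFF) + carry
        result |= (s & 0xFF) << (8 * i)
        carry = s >> 8
    return result


def execute_add(a, b):
    """Return (result, slices enabled, width of result in bytes)."""
    n = effective_slices(a, b)
    result = sliced_add(a, b, n)
    return result, n, byte_width(result)
```

For instance, `execute_add(3, 4)` enables a single slice, while `execute_add(200, 100)` enables two because the leading bits of the operands predict a carry out of the first byte.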

11. The method of claim 10, including saving the width of the result of 
said operation, and saving said result of said operation. 

12. The method of claim 10, wherein said step of looking up a width 
and a value of selected bits includes dedicated hardware for holding and retrieving 
said width and said value of selected bits. 

13. The method of claim 10, wherein the processor includes a register 
file, an arithmetic unit, a memory path, and a cache memory, and the register file, the 
arithmetic unit, and the cache memory are divided into a plurality of slices, each of 
which is of a reduced bit granularity, and the bits in all of the slices form a full width
word in the processor. 

14. The method of claim 13, wherein at least one slice is of 8 bit
granularity.

15. The method of claim 13, wherein at least one slice is of 16 bit
granularity.

16. The method of claim 13, wherein said step of enabling includes
logic to enable a required number of slices to execute the operation.

17. A processor comprising:

a plurality of slices, each of which is a portion of a full width word of
the processor, wherein each slice comprises a portion of a register file, a portion of
functional units, a portion of a memory path, a portion of a cache memory, and a
portion of other resources required to perform operations in the processor,

logic to save and retrieve a width and selected bits of operands used to
perform an operation in the processor,

logic to determine a prediction of arithmetic overflow when performing
the operation, based upon the width and the selected bits of the operands used to
perform the operation,

logic to determine a number of slices required to perform the operation
based upon the width of one or more operands, the functionality of the operation, and
the prediction of arithmetic overflow,

logic to activate the slices required to perform the operation.



18. The processor of claim 17, including logic to determine the width
of the result of the operation,

circuitry to store said width and selected bits of said result, and

circuitry to store said result.